EDGE MODIFICATION CRITERIA FOR ENHANCING
THE COMMUNICABILITY OF DIGRAPHS

FRANCESCA ARRIGO AND MICHELE BENZI


Abstract. We introduce new broadcast and receive communicability indices that can be used 
as global measures of how effectively information is spread in a directed network. Furthermore, we 
describe fast and effective criteria for the selection of edges to be added to (or deleted from) a given 
directed network so as to enhance these network communicability measures. Numerical experiments 
illustrate the effectiveness of the proposed techniques. 
1. Introduction. The concept of network communicability, first introduced by
Estrada and Hatano in [9], is being increasingly recognized as an important metric in
the structural analysis of networks. The communicability between two nodes i and
j is defined as the (i, j)th entry in the exponential of the adjacency matrix of the
network (or some scaled version of it). This choice can be justified on graph-theoretic 
grounds based on the concept of walks in a graph, and also from a statistical physics 
point of view if we regard a network as a system of coupled oscillators and consider 
the associated thermal Green’s function [10]. To date, there have been a number of 
applications of communicability to the analysis of real-world complex networks, a few 
of which are surveyed in [10]. 

In [3] a new node centrality measure was introduced based on the notion of total
communicability, which measures how easily a given node communicates with all the
nodes in the network. As pointed out in [3], this centrality measure is closely related 
to the notion of subgraph centrality [11], while being much easier to compute in the 
case of large networks. In [3] it was also proposed to use the sum of all the total node 
communicabilities, possibly normalized by the number of nodes, as a global measure 
of how effective the network is at propagating information among its nodes. This 
global index, referred to as total network communicability, was further shown in [1] 
to provide a good measure of the connectivity and robustness of complex networks, 
while being much faster to compute than existing metrics (such as the closely related 
free energy [8] and natural connectivity [26, 27]). Indeed, the cost of estimating the 
total network communicability using Lanczos-type methods scales linearly in the size 
of the graph for many types of networks [3]. 

Given that high communicability is often a highly desirable feature (especially in 
the case of certain infrastructure, information, and social networks) it is then natural 
to ask whether it is possible to design networks which are at the same time highly 
sparse (in the sense of low average degree) and yet have high total communicability. 
(This problem is analogous to that of constructing good expander graphs, see [20].) In 
[1] we considered the problem of modifying an existing sparse network so as to cause 
the total network communicability to change in some desired way. The modification 
can be the addition of a missing edge, the deletion of an existing edge, or the rewiring
of an existing edge. The goal could be to increase the total communicability of the 
network as much as possible (or nearly so), or to sparsify the network while minimizing 
the drop in the value of the total communicability, subject to constraints on the 
number of edge modifications allowed. In [1], several fast and effective heuristics have 
been developed that achieve the desired goal. 

A serious limitation of the notion of total communicability is that it is not well 
suited to deal with directed networks, and indeed all the above mentioned papers deal 
exclusively with undirected networks. The main reason is that in a directed graph 
each node plays two roles, that of broadcaster and that of receiver of information. 
It is clear that a single index cannot discriminate between these two forms of
communication. In this paper, building in part on the ideas in [2], we define two new
measures of total network communicability, which quantify how easily information is
propagated on a given directed network when the two fundamental modes of
communication (broadcasting and receiving) both have to be accounted for. Furthermore,
we generalize the edge modification criteria in [1] from the undirected to the directed 
case, using the newly introduced communicability indices as the objective functions. 

Examples of real-world directed networks include various information and
citation networks, such as corpora of documents linked to each other by directed edges,
for example hyperlinks between web pages or in-line references between Wikipedia
entries. In such networks, one may want to delete edges that contribute little to the 
overall authoritativeness of the network. In other cases, one may want to add edges 
from a hub to other nodes so as to increase the overall efficiency of the network in 
leading to authoritative documents. It is therefore of interest to introduce criteria 
for edge selection aimed at (approximately) optimizing the global communicability 
properties of directed networks. 

A few other authors have previously considered heuristics for edge manipulation 
in directed networks. In [28], edge modification criteria are introduced for tuning 
the synchronizability of a network, a property of interest in many settings. In [13], 
the authors have considered the potential impact of edge modification on epidemic 
dynamics on contact networks. It is quite possible that our edge selection criteria 
may find application in these contexts. 

The remainder of the paper is organized as follows. Section 2 contains some 
background notions about digraphs, the singular value decomposition, and network 
centrality measures. The new total communicability indices for digraphs are
introduced in section 3. The edge updating/downdating problem is described in section 4.
In section 5 we introduce the proposed heuristics for edge manipulation, and in
section 6 we discuss the results of numerical tests (including timings) using four real-world
directed networks. Concluding remarks are found in section 7.

2. Background. In this section we recall a few definitions and notations
associated with graphs. Let $\mathcal{G} = (V, \mathcal{E})$ be a graph with $n = |V|$ nodes and $m = |\mathcal{E}|$ edges
(or links). If for all $i, j \in V$ such that $(i,j) \in \mathcal{E}$ we also have $(j,i) \in \mathcal{E}$, the graph is
said to be undirected, as its edges can be traversed without following any prescribed
“direction.” On the other hand, if this condition does not hold, namely if there exists
$(i,j) \in \mathcal{E}$ such that $(j,i) \notin \mathcal{E}$, then the network is said to be directed. A directed
graph is commonly referred to as a digraph. If $(i,j) \in \mathcal{E}$ in a digraph, we will write
$i \to j$. As for the undirected case, an unweighted digraph can be represented by
means of a binary matrix $A \in \mathbb{R}^{n \times n}$ whose entries $(A)_{ij} = a_{ij}$ are nonzero if and only
if $(i,j) \in \mathcal{E}$. An ordered pair $(i,j) \notin \mathcal{E}$ will be called a virtual edge.
Every node $i \in V$ in a digraph has two types of degree, namely the in-degree and
the out-degree; the first, denoted by $d_{\mathrm{in}}(i)$, counts the number of edges of the form
$\ast \to i$, i.e., the number of nodes in $\mathcal{G}$ from which it is possible to reach $i$ in one step.
The out-degree, on the other hand, counts the number of nodes that can be reached
from $i$ in one step, i.e., the number of edges of the form $i \to \ast$, and is denoted by
$d_{\mathrm{out}}(i)$. The degrees of a node $i$ can be computed as the $i$th entries of the following
two vectors:

\[
\begin{cases}
d_{\mathrm{out}} = A\mathbf{1}, \\
d_{\mathrm{in}} = A^T\mathbf{1} = (\mathbf{1}^T A)^T.
\end{cases}
\]

Here $\mathbf{1}$ is the vector of all ones and the superscript “T” denotes transposition.
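As a small illustration of the two degree vectors, the following sketch computes them with NumPy. The 4-node digraph and the variable names are hypothetical and chosen only for this example; nodes are labeled from 0.

```python
import numpy as np

# Hypothetical 4-node digraph (0-based labels) with edges
# 0->2, 1->0, 1->3, 2->1, 3->1; A[i, j] = 1 means i -> j.
A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]])
ones = np.ones(A.shape[0], dtype=int)

d_out = A @ ones      # row sums: number of edges leaving each node
d_in = A.T @ ones     # column sums: number of edges entering each node
```

Both vectors sum to the number of edges, since every edge leaves exactly one node and enters exactly one node.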

A walk of length $k$ is a sequence of (possibly repeated) nodes $i_1, i_2, \ldots, i_{k+1}$ such
that $i_l \to i_{l+1}$ for all $l = 1, \ldots, k$; a walk is said to be closed if $i_1 = i_{k+1}$. A path is a
walk with no repeated nodes. A digraph is said to be strongly connected if every two
nodes in the network are connected through a path of finite length, while it is said 
to be weakly connected if this property holds when the directionality of the links is 
disregarded. 

Unless otherwise stated, every digraph in this paper is simple, i.e., unweighted,
weakly connected, and without self-loops or multi-edges. 

Let $A = U\Sigma V^T$ be a singular value decomposition (SVD) of the adjacency matrix
$A$ [21]. The matrix $\Sigma \in \mathbb{R}^{n \times n}$ is diagonal and its diagonal entries $(\Sigma)_{ii} = \sigma_i$ are the
singular values of $A$. These elements are non-negative and ordered as

\[
\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > \sigma_{r+1} = \cdots = \sigma_n = 0,
\]

where $r = \mathrm{rank}(A)$ is the rank of $A$. The matrices $U, V \in \mathbb{R}^{n \times n}$ are orthogonal and
$U = [u_1, u_2, \ldots, u_n]$ contains the left singular vectors of $A$, while $V = [v_1, v_2, \ldots, v_n]$
contains the right singular vectors. As is well known, $\Sigma$ is uniquely determined by
$A$ but $U$ and $V$ are not. Given an SVD of $A$, the corresponding compact singular
value decomposition (CSVD) of the matrix $A$ is given by $A = U_r \Sigma_r V_r^T$, where $U_r =
[u_1, u_2, \ldots, u_r] \in \mathbb{R}^{n \times r}$ and $V_r = [v_1, v_2, \ldots, v_r] \in \mathbb{R}^{n \times r}$ consist of the first $r$ columns
of $U$ and $V$, respectively, and $\Sigma_r = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r) \in \mathbb{R}^{r \times r}$ corresponds to the
leading $r \times r$ diagonal block of $\Sigma$.
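The relation between the SVD and the CSVD can be checked numerically. The sketch below uses `numpy.linalg.svd` on a hypothetical rank-deficient adjacency matrix; the rank tolerance `1e-12 * s[0]` is an assumption made for this example, not something prescribed by the text.

```python
import numpy as np

# Hypothetical 4-node digraph whose adjacency matrix has rank 3
# (rows 2 and 3 coincide).
A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

U, s, Vt = np.linalg.svd(A)            # full SVD: A = U @ diag(s) @ Vt
r = int(np.sum(s > 1e-12 * s[0]))      # numerical rank (tolerance is an assumption)

# Compact SVD: keep only the r leading singular triplets.
U_r, S_r, V_r = U[:, :r], np.diag(s[:r]), Vt[:r, :].T

A_from_full = U @ np.diag(s) @ Vt
A_from_compact = U_r @ S_r @ V_r.T
```

Both products reproduce `A` up to round-off, because the discarded triplets correspond to zero singular values.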

2.1. Hubs and authorities. Let us briefly recall here a few definitions
concerning the dual role every node plays in a digraph. In [22] Kleinberg stated that
in directed networks there exist two types of important nodes: hubs and authorities.
In particular, each node can be assigned a hub score and an authority score, which
quantify its ability to play these two roles. Good hubs are those nodes which
better broadcast information, while good authorities are those which better receive
information. These two types of importance for nodes are strongly related through
a recursive definition: the importance of a node as a hub is proportional to the
importance as authorities of the nodes it points to. Similarly, the importance of a node
as an authority depends on the importance as hubs of the nodes that point to it. This
recursive definition is highlighted in the implementation of the HITS algorithm (see
[22]), which makes use of the eigenvectors corresponding to the leading eigenvalue of
the symmetric matrices $AA^T$ (the hub matrix) and $A^TA$ (the authority matrix) to
rank the nodes as hubs and authorities, respectively.¹

1 For simplicity, unless otherwise specified, in this and the next section we assume that the 
dominant eigenvalue of AA T (and therefore of A T A) is simple. This ensures the uniqueness (up 
Using the SVD or the CSVD of the adjacency matrix, it easily follows that $AA^T =
U\Sigma^2U^T = U_r\Sigma_r^2U_r^T$ and $A^TA = V\Sigma^2V^T = V_r\Sigma_r^2V_r^T$. Therefore, the vector containing
the hub scores is $u_1$ while the vector containing the authority scores is $v_1$. By the
Perron-Frobenius theorem [21], from the non-negativity and irreducibility of the hub
and authority matrices it follows that these principal eigenvectors can be chosen so
as to have positive components. Hence, $u_1 > 0$ will be called the hub vector and the
vector $v_1 > 0$ will be called the authority vector.
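A minimal sketch of how the hub and authority vectors can be read off an SVD is given below. The digraph is hypothetical and was chosen so that the hub and authority matrices are irreducible; the sign normalization step reflects the fact that numerical SVD routines return singular vectors only up to a common sign.

```python
import numpy as np

# Hypothetical 4-node digraph (0-based labels) with edges
# 0->1, 0->2, 1->2, 2->0, 2->3, 3->0, 3->2.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)

U, s, Vt = np.linalg.svd(A)
u1, v1 = U[:, 0], Vt[0, :]         # leading left/right singular vectors
if u1.sum() < 0:                   # flip both together so that A v1 = s[0] u1 still holds
    u1, v1 = -u1, -v1

hub_ranking = np.argsort(-u1)          # nodes sorted by decreasing hub score
authority_ranking = np.argsort(-v1)    # nodes sorted by decreasing authority score
```

Here `u1` is an eigenvector of the hub matrix and `v1` an eigenvector of the authority matrix, both for the eigenvalue `s[0]**2`.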

The powers of the hub and authority matrices are related to the number of
particular types of walks in the digraph. Following [2, 6], we define an alternating walk of
length $k$ starting with an out-edge as a list of nodes $i_1, i_2, \ldots, i_{k+1}$ such that there
exists an edge $(i_l, i_{l+1})$ if $l$ is odd and an edge $(i_{l+1}, i_l)$ if $l$ is even. Hence, an alternating
walk starting with an out-edge has the form

\[
i_1 \to i_2 \leftarrow i_3 \to \cdots
\]

Similarly, an alternating walk of length $k$ starting with an in-edge is a list of nodes
$i_1, i_2, \ldots, i_{k+1}$ such that

\[
i_1 \leftarrow i_2 \to i_3 \leftarrow \cdots
\]

i.e., such that there exists an edge $(i_l, i_{l+1})$ if $l$ is even and an edge $(i_{l+1}, i_l)$ otherwise.

It is well known that the entries of powers of the adjacency matrix of a graph can
be used to count the number of walks of a certain length in the network. Similarly, it is
known (see, e.g., [6]) that $[AA^TA\cdots]_{ij}$ (where there are $k$ matrices being multiplied)
counts the number of alternating walks of length $k$, starting with an out-edge, from
node $i$ to node $j$, whereas $[A^TAA^T\cdots]_{ij}$ (where there are $k$ matrices being multiplied)
counts the number of alternating walks of length $k$, starting with an in-edge, from
node $i$ to node $j$. Thus, $[(AA^T)^k]_{ij}$ and $[(A^TA)^k]_{ij}$ count the number of alternating
walks of length $2k$.
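The counting property can be verified by brute force on a small example. The sketch below compares the entries of the alternating matrix products with an explicit enumeration of alternating walks; the digraph and the helper names `alternating_product` and `count_alternating_walks` are hypothetical and introduced only for illustration.

```python
import numpy as np
from itertools import product

# Hypothetical 4-node digraph (0-based labels); A[i, j] = 1 means i -> j.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]])

def alternating_product(A, k, start="out"):
    """Product of k factors A A^T A ... (start='out') or A^T A A^T ... (start='in')."""
    M = np.eye(A.shape[0], dtype=int)
    use_A = (start == "out")
    for _ in range(k):
        M = M @ (A if use_A else A.T)
        use_A = not use_A
    return M

def count_alternating_walks(A, k, i, j, start="out"):
    """Enumerate all node lists i = i_1, ..., i_{k+1} = j and count the alternating walks."""
    n = A.shape[0]
    total = 0
    for middle in product(range(n), repeat=k - 1):
        nodes = (i, *middle, j)
        valid = True
        for step in range(k):
            a, b = nodes[step], nodes[step + 1]
            # 0-based step even <=> l odd: follow the edge a -> b when starting with an out-edge.
            forward = (step % 2 == 0) == (start == "out")
            if (A[a, b] if forward else A[b, a]) == 0:
                valid = False
                break
        total += int(valid)
    return total
```

For instance, `alternating_product(A, 3)[i, j]` should equal `count_alternating_walks(A, 3, i, j)` for every pair of nodes.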

In the next section, we will show how to use these quantities to define two global 
measures of how effectively the nodes in a digraph exchange information. 

3. Total network communicabilities for digraphs. In [3] a global measure
of how easily information is diffused across an (undirected) network has been defined
in terms of the matrix exponential of the adjacency matrix. More in detail, recalling
that the entries of the matrix exponential count the total number of walks of any
length between two nodes, weighting walks of length $k$ by a factor $\frac{1}{k!}$, the total network
communicability has been defined as the sum of all the entries of this matrix:

\[
TC(A) := \sum_{i=1}^{n}\sum_{j=1}^{n} (e^A)_{ij} = \mathbf{1}^T e^A \mathbf{1} = \sum_{k=0}^{\infty} \frac{\mathbf{1}^T A^k \mathbf{1}}{k!}. \tag{3.1}
\]

This quantity, possibly normalized by n, has been empirically shown to provide a 
good measure of how effectively the information flows along the network and of how 
well connected an undirected network is (see [1, 3]). 
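As an illustration of (3.1), the following sketch evaluates the total network communicability of a small undirected graph in two ways: through the spectral decomposition of the symmetric adjacency matrix, and through a truncated walk series. The graph and the truncation length of 30 terms are assumptions made for this example; for large networks the text points to Lanczos-type methods instead.

```python
import numpy as np
from math import factorial

# Hypothetical connected undirected graph on 5 nodes:
# edges {0,1}, {1,2}, {2,3}, {1,3}, {3,4}.
M = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 1, 0],
              [0, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
n = M.shape[0]
ones = np.ones(n)

# e^M via the eigendecomposition of the symmetric matrix M.
lam, X = np.linalg.eigh(M)
exp_M = X @ np.diag(np.exp(lam)) @ X.T
TC = ones @ exp_M @ ones

# Same quantity as a truncated sum over walks of length k, weighted by 1/k!.
TC_series = sum(ones @ np.linalg.matrix_power(M, k) @ ones / factorial(k)
                for k in range(30))
```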

Let now A be the adjacency matrix of a directed graph. In analogy with the 
undirected case, we can consider the total network communicability (3.1). In principle, 
this quantity (possibly normalized by n) gives us an idea of how efficient the network 


to scalar multiples) of the principal eigenvectors of these matrices, and therefore of the hub and 
authority rankings. We refer the reader to [12] and to [23, page 120] for a discussion of this issue. 
is, globally, at diffusing information. However, by following this approach we would
be completely disregarding the twofold nature of nodes, which is one of the main 
features of digraphs. 

To better capture the dual behavior of nodes, we introduce two new global indices 
of communicability defined in terms of functions of the hub and authority matrices. 

Definition 3.1. Let $A$ be the adjacency matrix of a simple digraph and let
$f : \mathbb{R} \to \mathbb{R}$ be a function defined on the spectrum of $AA^T$. The total hub
$f$-communicability of the digraph is defined as

\[
T_hC(A, f) := \mathbf{1}^T f(AA^T)\mathbf{1} = \sum_{i=1}^{n} f(\sigma_i^2)(\mathbf{1}^T u_i)^2.
\]

Similarly, the total authority $f$-communicability of the digraph is defined as

\[
T_aC(A, f) := \mathbf{1}^T f(A^TA)\mathbf{1} = \sum_{i=1}^{n} f(\sigma_i^2)(\mathbf{1}^T v_i)^2.
\]
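The two expressions in Definition 3.1 can be compared numerically. The sketch below evaluates the quadratic forms through an eigendecomposition of the hub and authority matrices and through the singular triplets of $A$, using $f(t) = \cosh(\sqrt{t})$ as the example function. The digraph and the helper name `sym_fun` are hypothetical, and the clipping of tiny negative eigenvalues is an assumption that only makes sense because both matrices are positive semidefinite.

```python
import numpy as np

# Hypothetical 4-node digraph (0-based labels).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
ones = np.ones(A.shape[0])

def f(t):
    return np.cosh(np.sqrt(t))

def sym_fun(S, f):
    """f(S) for a symmetric positive semidefinite S, via its eigendecomposition."""
    lam, Q = np.linalg.eigh(S)
    lam = np.clip(lam, 0.0, None)   # remove round-off negatives before sqrt
    return Q @ np.diag(f(lam)) @ Q.T

ThC = ones @ sym_fun(A @ A.T, f) @ ones     # total hub f-communicability
TaC = ones @ sym_fun(A.T @ A, f) @ ones     # total authority f-communicability

# Same quantities from the singular values and singular vectors of A.
U, s, Vt = np.linalg.svd(A)
ThC_svd = np.sum(f(s ** 2) * (ones @ U) ** 2)
TaC_svd = np.sum(f(s ** 2) * (Vt @ ones) ** 2)
```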

The motivation for using these quadratic forms as total communicability indices 
is that they exploit the recursive definition that relates hubs and authorities in a 
directed network. Assume that the function $f$ can be expressed as a power series of
the form

\[
f(t) = \sum_{k=0}^{\infty} c_k t^k, \qquad c_k \geq 0 \quad \forall k = 0, 1, \ldots \tag{3.2}
\]

Then, an easy computation shows that the total hub $f$-communicability can be
described in terms of the in-degree vector and of the authority matrix as

\[
T_hC(A, f) = c_0 n + c_1\|d_{\mathrm{in}}\|_2^2 + \sum_{k=1}^{\infty} c_{k+1}\, d_{\mathrm{in}}^T (A^TA)^k d_{\mathrm{in}},
\]

thus highlighting the fact that the overall ability of nodes to broadcast information
depends on their ability to receive it. Note that due to the nonnegativity assumption
on the coefficients in (3.2), $T_hC(A, f)$ is an inherently nonnegative quantity.

Analogous computations carried out on the total authority $f$-communicability
show that this index can be completely described in terms of the out-degree vector
and of the hub matrix:

\[
T_aC(A, f) = c_0 n + c_1\|d_{\mathrm{out}}\|_2^2 + \sum_{k=1}^{\infty} c_{k+1}\, d_{\mathrm{out}}^T (AA^T)^k d_{\mathrm{out}},
\]

thus showing that the overall ability of nodes to receive information depends on how
well they are able to broadcast it. Note that $T_aC(A, f)$ is, again, always nonnegative.
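The degree-vector expansions can be checked on a small example. The sketch below takes $f(t) = \cosh(\sqrt{t})$, whose power series coefficients are $c_k = 1/(2k)!$, and compares truncated versions of the two series with the values obtained from the singular value decomposition. The digraph and the truncation length `K = 25` are assumptions made for illustration.

```python
import numpy as np
from math import factorial

# Hypothetical 4-node digraph (0-based labels).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = A.shape[0]
ones = np.ones(n)
d_out, d_in = A @ ones, A.T @ ones

def c(k):
    """Coefficient of t^k in the power series of cosh(sqrt(t))."""
    return 1.0 / factorial(2 * k)

K = 25  # truncation length (assumption; the terms decay factorially)
hub_series = c(0) * n + c(1) * (d_in @ d_in) + sum(
    c(k + 1) * (d_in @ np.linalg.matrix_power(A.T @ A, k) @ d_in) for k in range(1, K))
auth_series = c(0) * n + c(1) * (d_out @ d_out) + sum(
    c(k + 1) * (d_out @ np.linalg.matrix_power(A @ A.T, k) @ d_out) for k in range(1, K))

# Reference values: 1^T cosh(sqrt(A A^T)) 1 and 1^T cosh(sqrt(A^T A)) 1 via the SVD.
U, s, Vt = np.linalg.svd(A)
ThC = np.sum(np.cosh(s) * (ones @ U) ** 2)
TaC = np.sum(np.cosh(s) * (Vt @ ones) ** 2)
```

The hub index is driven by the in-degree vector and the authority index by the out-degree vector, which mirrors the recursive relation between hubs and authorities.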

Remark 1. We stress that the total hub and authority $f$-communicabilities are
invariant under graph isomorphism. Indeed, let $\mathcal{G}_1$ and $\mathcal{G}_2$ be two isomorphic graphs
with associated adjacency matrices $A_1$ and $A_2$. Then there exists a permutation
matrix $P$ such that $A_2 = PA_1P^T$. Therefore,

\[
T_hC(A_2, f) = \mathbf{1}^T f(A_2A_2^T)\mathbf{1} = \mathbf{1}^T f(PA_1P^TPA_1^TP^T)\mathbf{1}
= \mathbf{1}^T P f(A_1A_1^T) P^T \mathbf{1} = \mathbf{1}^T f(A_1A_1^T)\mathbf{1} = T_hC(A_1, f).
\]

Similarly,

\[
T_aC(A_2, f) = \mathbf{1}^T f(A_2^TA_2)\mathbf{1} = \mathbf{1}^T P f(A_1^TA_1) P^T \mathbf{1}
= \mathbf{1}^T f(A_1^TA_1)\mathbf{1} = T_aC(A_1, f).
\]


In this paper we will focus on the total hub and authority $f$-communicabilities
when the function $f(t) = \cosh(\sqrt{t})$ is used in the definition. The choice of the function
$f(t)$ may seem unusual; however, we argue that this choice is the most natural one
if one wants to “translate” the idea of total communicability to the case of digraphs.
Indeed, in the undirected case the total communicability was defined as the sum of all
the entries of the matrix exponential. This index counts all the walks of any length
taking place in the network, weighting walks of length $k$ by a factor $\frac{1}{k!}$. In the case of
a digraph, we need to count all the alternating walks, again penalizing longer walks.
This is accomplished by taking $f(t) = \cosh(\sqrt{t})$; for this choice of $f$ we obtain the
total hub communicability as

\[
T_hC(A) := \mathbf{1}^T\left(\sum_{k=0}^{\infty}\frac{(AA^T)^k}{(2k)!}\right)\mathbf{1}
= \mathbf{1}^T\left(\sum_{k=0}^{\infty}\frac{(\sqrt{AA^T})^{2k}}{(2k)!}\right)\mathbf{1}
= \mathbf{1}^T\cosh(\sqrt{AA^T})\,\mathbf{1} = T_hC(A, \cosh(\sqrt{t})),
\]

and, similarly, the total authority communicability as

\[
T_aC(A) := \mathbf{1}^T\left(\sum_{k=0}^{\infty}\frac{(A^TA)^k}{(2k)!}\right)\mathbf{1}
= \mathbf{1}^T\left(\sum_{k=0}^{\infty}\frac{(\sqrt{A^TA})^{2k}}{(2k)!}\right)\mathbf{1}
= \mathbf{1}^T\cosh(\sqrt{A^TA})\,\mathbf{1} = T_aC(A, \cosh(\sqrt{t})).
\]
A further justification for the choice of the function $f(t)$ comes from considering
the following construction (see [2]). Let

\[
\mathscr{A} = \begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix} \tag{3.3}
\]

be the adjacency matrix of the bipartite graph $\mathscr{G} = (\mathscr{V}, \mathscr{E})$ obtained from the original
digraph represented by $A$. This graph has $2n$ nodes forming the set $\mathscr{V} = V \cup V'$,
where $V$ is the original set of nodes and $V' = \{1' = n+1, 2' = n+2, \ldots, n' = 2n\}$
is a set of copies of the nodes in $\mathcal{G} = (V, \mathcal{E})$. The edges between the elements in $\mathscr{V}$
are undirected and $(i, j') \in \mathscr{E}$ with $i \in V$ and $j' \in V'$ if and only if $(i,j) \in \mathcal{E}$ in the
original digraph.

Note that in the bipartite graph the first $n$ nodes contained in $V$ can be seen
as the original nodes of the digraph when they play their role of broadcasters of
information, while the $n$ copies contained in $V'$ represent the original nodes in their
role of receivers. It is worth mentioning that the eigenvector of $\mathscr{A}$ corresponding
to the leading eigenvalue $\lambda_1(\mathscr{A}) = \sigma_1$ is, up to normalization, the vector
$q_1 = \begin{pmatrix} u_1 \\ v_1 \end{pmatrix}$. The choice of
$f(t) = \cosh(\sqrt{t})$ follows from the next result.

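Proposition 3.2. [2, Proposition 1] Let $\mathscr{A}$ be as in (3.3) and let $A = U\Sigma V^T$ be
an SVD of $A$. Then

\[
e^{\mathscr{A}} = \begin{pmatrix} \cosh(\sqrt{AA^T}) & U\sinh(\Sigma)V^T \\ V\sinh(\Sigma)U^T & \cosh(\sqrt{A^TA}) \end{pmatrix}. \tag{3.4}
\]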
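The block structure in (3.4) is easy to confirm numerically on a small digraph. The sketch below builds the symmetric bipartite matrix, computes its exponential through an eigendecomposition, and compares it with the four blocks assembled from an SVD of $A$. The digraph and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical 4-node digraph (0-based labels).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = A.shape[0]
Z = np.zeros((n, n))

# Symmetric adjacency matrix of the associated bipartite graph, as in (3.3).
bip = np.block([[Z, A], [A.T, Z]])

# Matrix exponential of the symmetric matrix via its eigendecomposition.
lam, Q = np.linalg.eigh(bip)
exp_bip = Q @ np.diag(np.exp(lam)) @ Q.T

# Blocks predicted by (3.4), assembled from an SVD of A.
U, s, Vt = np.linalg.svd(A)
V = Vt.T
cosh_hub = U @ np.diag(np.cosh(s)) @ U.T     # cosh(sqrt(A A^T))
cosh_auth = V @ np.diag(np.cosh(s)) @ V.T    # cosh(sqrt(A^T A))
sinh_block = U @ np.diag(np.sinh(s)) @ V.T   # U sinh(Sigma) V^T
expected = np.block([[cosh_hub, sinh_block], [sinh_block.T, cosh_auth]])
```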
An important feature of this matrix is that its entries are nonnegative. Thus, these
quantities can be used to describe the importance of nodes and how well they
communicate when they are acting as broadcasters or receivers of information in the graph [2].
Indeed, the entries of the two diagonal blocks $\cosh(\sqrt{AA^T})$ and $\cosh(\sqrt{A^TA})$ provide
centrality and communicability indices for nodes and pairs of nodes when they are
all seen as playing the same role in the network. More in detail, the diagonal entries
of the first diagonal block give the centralities for the nodes in the original network
when they are seen as broadcasters of information (hubs). Likewise, the diagonal
of the second block contains the centralities for the nodes in their role of receivers
(authorities). Similarly to the off-diagonal entries of the matrix exponential of an
undirected graph, the off-diagonal entries of these diagonal blocks measure how well
two nodes, both acting as broadcasters (resp., receivers), exchange information.

As for the off-diagonal blocks in (3.4), they contain information concerning how 
nodes exchange information when one node is playing the role of broadcaster (resp., 
receiver) and the other is acting as a receiver (resp., broadcaster). 

Thus, the total hub communicability and total authority communicability
defined as $T_hC(A) = \mathbf{1}^T\cosh(\sqrt{AA^T})\,\mathbf{1}$ and $T_aC(A) = \mathbf{1}^T\cosh(\sqrt{A^TA})\,\mathbf{1}$, respectively,
account for the overall ability of the network to exchange information when all its
nodes are playing the same role of broadcasters ($T_hC(A)$) or receivers ($T_aC(A)$).

4. Edge modification strategies. The main goal of this work is to develop 
heuristics that can be used to add/remove edges from a digraph in order to tune 
the total hub and/or authority communicability. In particular, we will call update of
$(i,j) \notin \mathcal{E}$ the addition of this virtual edge to the network; we want to perform this
operation in such a way that this addition increases as much as possible the quantities
of interest. Note that, due to the nonnegativity condition in (3.2), the addition of an
edge can only increase the total communicabilities $T_hC(A)$ and $T_aC(A)$.

The operation of removing an edge from the network will be referred to as the
downdate of an edge. Our aim is to select the edge to be removed in such a way that
the target functions $T_hC$ and $T_aC$ are not penalized too much, i.e., their values do not
drop significantly as edges are removed.² Both these operations can be described as
rank-one modifications of the adjacency matrix $A$ of the digraph $\mathcal{G}$ or, equivalently, as
rank-two modifications of the adjacency matrix $\mathscr{A}$ of the associated bipartite graph
$\mathscr{G}$.
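The rank-one and rank-two nature of an update can be made concrete as follows. The sketch adds a single virtual edge to a hypothetical digraph and inspects the resulting perturbations of the adjacency matrix and of the associated bipartite matrix; the helper name `bipartite` and the chosen edge are assumptions made for illustration.

```python
import numpy as np

def bipartite(A):
    """Symmetric 2n x 2n matrix with off-diagonal blocks A and A^T."""
    n = A.shape[0]
    Z = np.zeros((n, n))
    return np.block([[Z, A], [A.T, Z]])

# Hypothetical 4-node digraph (0-based labels).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = A.shape[0]
e = np.eye(n)

i, j = 0, 3                           # virtual edge 0 -> 3 (A[0, 3] == 0)
A_up = A + np.outer(e[i], e[j])       # update: A + e_i e_j^T

rank_digraph = np.linalg.matrix_rank(A_up - A)
rank_bipartite = np.linalg.matrix_rank(bipartite(A_up) - bipartite(A))
```

A downdate is the same construction with a minus sign applied to an existing edge.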

We first introduce some edge centrality measures that can be used to rank the
(virtual) edges in the digraph; we then use the derived rankings to select which
modifications to perform. More in detail, a virtual edge having a large centrality is considered
important and thus its addition is expected to highly enhance the total
communicabilities. On the other hand, we will remove edges that have a low ranking, since they
are not expected to carry a lot of information; thus, their removal is not expected to
heavily penalize the hub and authority communicabilities of the network.

The resulting updating and downdating strategies will be similar in spirit to those 
adopted in the undirected case [1]. However, as explained in more detail in the next 
subsection, we cannot simply apply the heuristics in [1] to the bipartite graph since 
doing so could lead to possible loss of structure. 


2 Clearly, our approach can be adapted so as to obtain the opposite effect if so desired. Indeed, 
we can adapt our algorithms to select edges whose removal heavily penalizes the target functions. 
4.1. Bipartite graphs vs. digraphs. In this section we will describe two
different ways of tackling the problem of selecting $K$ edge modifications to be performed
on the network in order to tune the communicability indices $T_hC(A)$ and $T_aC(A)$.

First we describe how to rank the edges. A priori, there are two natural
approaches. Indeed, given the definitions of communicabilities in terms of the function
$f(t) = \cosh(\sqrt{t})$, we can either work on the matrix $\mathscr{A}$ or on the original adjacency
matrix $A$. In the first case, we would adapt to the matrix $\mathscr{A}$ the techniques
developed for the undirected case which performed best according to the results in [1],
taking into account the need to preserve the zero-nonzero block structure of $\mathscr{A}$. The
second approach, on the other hand, requires the introduction of new edge centrality
measures specially developed for the directed case.

We will show that the new edge centrality measures for digraphs allow us to
develop heuristics that perform as well as or better than the techniques for undirected
graphs applied to $\mathscr{A}$.

We want to stress here that the set of (virtual) edges among which we select the
modifications is the same in both cases, since one wants to preserve the antidiagonal
block structure (3.3) of $\mathscr{A}$. Indeed, if a new edge were to destroy the structure, it
could not be “translated” into a new directed edge for the original digraph.

4.2. Edge centralities: undirected case. In the following, we will briefly
recall the edge centrality measures introduced in [1] that showed the best performance.
These will be used on $\mathscr{A}$ to tackle the updating and downdating problems.

Let $M$ be the adjacency matrix of a simple, undirected graph. We call the edge
eigenvector centrality of the (virtual) edge $(i,j)$ the quantity:

\[
{}^{e}EC(i,j) = q_1(i)\,q_1(j),
\]

where $q_1(i)$ is the $i$th entry of the Perron vector $q_1$ of the matrix $M$ (see [21, 5]). We
call edge total communicability centrality of $(i,j)$ the quantity:

\[
{}^{e}TC(i,j) = (e^M\mathbf{1})_i\,(e^M\mathbf{1})_j.
\]
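Both undirected edge centralities can be computed from a single eigendecomposition of $M$ when the graph is small. The sketch below ranks the virtual edges of a hypothetical undirected graph with each measure; the graph and the function names are assumptions introduced for this example.

```python
import numpy as np

# Hypothetical connected undirected graph on 5 nodes:
# edges {0,1}, {1,2}, {2,3}, {1,3}, {3,4}.
M = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 1, 0],
              [0, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
n = M.shape[0]

lam, X = np.linalg.eigh(M)       # eigenvalues in ascending order
q1 = X[:, -1]                    # Perron vector (eigenvector of the largest eigenvalue)
if q1.sum() < 0:                 # fix the sign so that q1 > 0
    q1 = -q1
tc = (X @ np.diag(np.exp(lam)) @ X.T) @ np.ones(n)   # node total communicabilities e^M 1

def edge_eigenvector_centrality(i, j):
    return q1[i] * q1[j]

def edge_total_communicability(i, j):
    return tc[i] * tc[j]

# Virtual edges: unordered pairs of distinct nodes that are not adjacent.
virtual = [(i, j) for i in range(n) for j in range(i + 1, n) if M[i, j] == 0]
by_ec = sorted(virtual, key=lambda e: edge_eigenvector_centrality(*e), reverse=True)
by_tc = sorted(virtual, key=lambda e: edge_total_communicability(*e), reverse=True)
```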

Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ denote the eigenvalues of $M$. It has been pointed out
that, when the spectral gap $\lambda_1 - \lambda_2$ is large enough, then these two centrality measures
provide very similar rankings, especially when the attention is restricted to the top 
edges; on the other hand, different rankings may be obtained when the gap is small 
[1, 4]. 

4.3. Edge centralities: directed case. We now want to define two new edge 
centrality measures that take into account the directionality of links and that can be 
computed by directly working on the unsymmetric adjacency matrix A. 

In [1] it has been pointed out that one of the main factors in the evolution of
the total communicability is the dominant eigenvalue $\lambda_1$ of the matrix involved in its
computation. This is clear since for an undirected graph with adjacency matrix $A$ the
total communicability can be expressed as

\[
TC(A) = \sum_{i=1}^{n} e^{\lambda_i}\alpha_i^2, \qquad \alpha_i = \mathbf{1}^T x_i,
\]

hence the dominant contribution to $TC(A)$ comes from the first term of the sum.
Thus, heuristics that increase the spectral radius of A as much as possible will likely 
be effective also when the goal is to increase the total communicability as much as 
possible. For example, one of the methods found in [1] to have the best performance
relies on the edge eigenvector centrality, which is indeed directly connected to the 
change that occurs in the magnitude of the leading eigenvalue (see [1] for more details). 

Transferring this idea to $T_hC(A)$ and $T_aC(A)$, it follows that we want to define (if
possible) an edge centrality measure that allows us to control the change in the leading
singular value of $A$, which corresponds to the square root of the leading eigenvalue of
$AA^T$ and $A^TA$.

Proposition 4.1. Let $A$ be the adjacency matrix of a graph. Let $u_1$ and $v_1$
be the hub and authority vectors, respectively. Let $\sigma_1$ be the leading singular value
of $A$. Consider the adjacency matrix of the graph obtained after the addition of the
virtual edge $(i,j)$: $\widehat{A} = A + e_ie_j^T$. Then the leading eigenvalue $\widehat{\sigma}_1^2$ of the new hub and
authority matrices satisfies

\[
\widehat{\sigma}_1^2 \geq \sigma_1^2 + 2\sigma_1u_1(i)v_1(j) + \max\{u_1(i)^2, v_1(j)^2\}. \tag{4.1}
\]

The inequality is strict if $AA^T$ is irreducible.

Moreover, let $\widetilde{A} = A - e_ie_j^T$ denote the adjacency matrix obtained after the removal
of the existing edge $i \to j$. Then the leading eigenvalue $\widetilde{\sigma}_1^2$ of the new hub and authority
matrices satisfies

\[
\sigma_1^2 \geq \widetilde{\sigma}_1^2 \geq \sigma_1^2 - 2\sigma_1u_1(i)v_1(j) + \max\{u_1(i)^2, v_1(j)^2\}. \tag{4.2}
\]

The first inequality is strict if $AA^T$ is irreducible.

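Proof. Using the Rayleigh-Ritz Theorem (see, for example, [21]) for the updated
matrix $\widehat{A} = A + e_ie_j^T$ we get:

\begin{align*}
\widehat{\sigma}_1^2 = \lambda_1(\widehat{A}\widehat{A}^T) &= \max_{\|z\|_2 = 1} z^T(\widehat{A}\widehat{A}^T)z \\
&\geq u_1^T(\widehat{A}\widehat{A}^T)u_1 \\
&= \|(A^T + e_je_i^T)u_1\|_2^2 \\
&= \|\sigma_1v_1 + u_1(i)e_j\|_2^2 \\
&= \sigma_1^2 + 2\sigma_1u_1(i)v_1(j) + u_1(i)^2.
\end{align*}

Similarly, by working on the authority matrix one gets:

\[
\widehat{\sigma}_1^2 = \lambda_1(\widehat{A}^T\widehat{A}) \geq v_1^T(\widehat{A}^T\widehat{A})v_1 = \sigma_1^2 + 2\sigma_1u_1(i)v_1(j) + v_1(j)^2.
\]

From these inequalities, and from basic facts from Perron-Frobenius theory, the
conclusion easily follows. Similar arguments can be used to prove (4.2). □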
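The bounds (4.1) and (4.2) can be checked numerically on a small digraph by trying every admissible update and downdate. The sketch below does this for a hypothetical 4-node digraph with irreducible hub matrix; the tolerance `tol` and the helper name `leading_sv` are assumptions made for the example.

```python
import numpy as np

# Hypothetical 4-node digraph (0-based labels).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = A.shape[0]

def leading_sv(B):
    """Largest singular value of B."""
    return np.linalg.svd(B, compute_uv=False)[0]

U, s, Vt = np.linalg.svd(A)
sigma1, u1, v1 = s[0], U[:, 0], Vt[0, :]
if u1.sum() < 0:                 # choose the nonnegative hub/authority vectors
    u1, v1 = -u1, -v1

tol = 1e-10
update_ok, downdate_ok = [], []
for i in range(n):
    for j in range(n):
        if i == j:
            continue             # no self-loops
        E = np.outer(np.eye(n)[i], np.eye(n)[j])
        cross = 2 * sigma1 * u1[i] * v1[j]
        extra = max(u1[i] ** 2, v1[j] ** 2)
        if A[i, j] == 0:         # virtual edge: check (4.1)
            new = leading_sv(A + E) ** 2
            update_ok.append(new >= sigma1 ** 2 + cross + extra - tol)
        else:                    # existing edge: check (4.2)
            new = leading_sv(A - E) ** 2
            downdate_ok.append(sigma1 ** 2 + tol >= new >= sigma1 ** 2 - cross + extra - tol)
```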

Relations (4.1) and (4.2) motivate the following definition. 

Definition 4.2. Let $A$ be the adjacency matrix of a directed graph. Let $u_1$ and
$v_1$ be its HITS hub and authority vectors, respectively. Then the edge HITS centrality
of the existing/virtual edge $(i,j)$ is defined as

\[
{}^{e}HC(i,j) = u_1(i)\,v_1(j).
\]

Notice that when $A$ is symmetric this definition reduces to that of edge eigenvector
centrality: ${}^{e}EC(i,j) = x_1(i)x_1(j)$, where $x_1$ is the eigenvector associated with the
leading eigenvalue of $A$.
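A minimal sketch of how the edge HITS centrality could be used to rank virtual edges is shown below. The digraph is hypothetical, and the ranking step is only an illustration of the definition, not the selection procedure of section 5.

```python
import numpy as np

# Hypothetical 4-node digraph (0-based labels).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = A.shape[0]

U, s, Vt = np.linalg.svd(A)
u1, v1 = U[:, 0], Vt[0, :]
if u1.sum() < 0:                  # choose the nonnegative hub/authority vectors
    u1, v1 = -u1, -v1

scores = np.outer(u1, v1)         # scores[i, j] = u1(i) * v1(j) for the ordered pair i -> j

# Virtual edges: ordered pairs (i, j), i != j, with no edge i -> j.
virtual = [(i, j) for i in range(n) for j in range(n) if i != j and A[i, j] == 0]
ranked = sorted(virtual, key=lambda e: scores[e], reverse=True)
best = ranked[0]                  # candidate with the largest edge HITS centrality
```

Note that the score of `(i, j)` generally differs from that of `(j, i)`, since the source node contributes its hub score and the target node its authority score.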

Remark 2. Inequalities (4.1) and (4.2) and, consequently, definition 4.2 suggest
that there is a “prescribed direction” one has to follow when introducing a new edge
centrality measure. Indeed, it is required to use the centrality as broadcaster for the
source node $i$ and the centrality as receiver for the target node $j$ when evaluating the
importance of the (virtual) edge $i \to j$. This observation confirms a natural intuition
and motivates the usage of this same “orientation” in all our definitions and methods
(cf. section 5).

The next edge centrality measure we want to define relies on the use of the
total communicability of nodes. Recall that in the case of an undirected network
represented by the symmetric adjacency matrix $M$, the total communicability of node
$i$ is defined as $(e^M\mathbf{1})_i$. This quantity describes how well node $i$ communicates with
the whole network. As discussed in section 3, this centrality measure is well defined
for any adjacency matrix, in particular for the adjacency matrices of digraphs, and
indeed the row and column sums of $e^A$ do provide in some cases meaningful measures
of how well nodes broadcast information (row sums of $e^A$) and how good they are
at receiving information (column sums of $e^A$). However, the expressions describing
these quantities do not provide information on the alternating walks taking place in
the digraph and, thus, miss a crucial feature of communication in real-world directed
networks.

For this reason, we introduce here new definitions for the total communicabilities
of nodes which can be shown to be directly connected to their twofold nature. In order
to do so, we make use of the concept of generalized matrix function first introduced
in [18]. Let $A = U_r\Sigma_rV_r^T \in \mathbb{R}^{n \times n}$ be a matrix of rank $r$, and let $f : \mathbb{R} \to \mathbb{R}$ be
a function such that $f(\sigma_i)$ exists for all $i = 1, 2, \ldots, r$, so that the matrix function
$f(\Sigma_r) = \mathrm{diag}(f(\sigma_1), f(\sigma_2), \ldots, f(\sigma_r))$ is well defined.

Following [18], we define the generalized matrix function $f^{\diamond} : \mathbb{R}^{n \times n} \to \mathbb{R}^{n \times n}$ as

\[
f^{\diamond}(A) = U_r f(\Sigma_r) V_r^T = \sum_{k=1}^{r} f(\sigma_k)\,u_kv_k^T.
\]

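It is easy to check that

\[
f^{\diamond}(A) = \left(\sum_{k=1}^{r} \frac{f(\sigma_k)}{\sigma_k}\,u_ku_k^T\right)A = A\left(\sum_{k=1}^{r} \frac{f(\sigma_k)}{\sigma_k}\,v_kv_k^T\right). \tag{4.3}
\]

These equalities show that a generalized matrix function can be expressed in
terms of $A$ and either $AA^T$ or $A^TA$. Therefore, the entries of $f^{\diamond}(A)$, and hence
its row/column sums, can be used as meaningful measures of importance in the
directed case, provided that they are all non-negative.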

It turns out that, in general, this is not the case for the generalized matrix
exponential. Indeed, consider for example the generalized matrix exponential of the
adjacency matrix

\[
A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \tag{4.4}
\]

associated with the digraph in Fig. 1. It turns out that its (3,1) and (4,4) entries
are negative, and thus these quantities cannot be interpreted as
communicability/centrality measures.
Fig. 1. The digraph associated with the adjacency matrix described in (4.4).


If we instead consider the generalized hyperbolic sine:

\[
\sinh^{\diamond}(A) = U_r\sinh(\Sigma_r)V_r^T = \sum_{k=1}^{r} \sinh(\sigma_k)\,u_kv_k^T,
\]

we have that this matrix corresponds to the top right block of the matrix $e^{\mathscr{A}}$; indeed,
we can rewrite equation (3.4) as

\[
e^{\mathscr{A}} = \begin{pmatrix} \cosh(\sqrt{AA^T}) & \sinh^{\diamond}(A) \\ \sinh^{\diamond}(A)^T & \cosh(\sqrt{A^TA}) \end{pmatrix}. \tag{4.5}
\]

Hence, the entries of $\sinh^{\diamond}(A)$ are all non-negative, and can be used to quantify
how well nodes communicate when they are playing different roles. More precisely, 
reasoning in terms of alternating walks shows that the (i,j)th entry of this matrix 
describes how well node i exchanges information with node j when the first is playing 
the role of hub and the latter that of authority. Using this generalized matrix function 
we can introduce two new centrality measures for nodes in digraphs. 
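
The block structure in (4.5) can be verified on the four-node example. The sketch below is illustrative only: it assumes the bipartite matrix has the form $\mathcal{A} = [0, A; A^T, 0]$, reads the example digraph as having edges 1→3, 2→1, 2→4, 3→2, 4→2, and uses helper names (`sym_fun`, `sinh_diamond`, `cosh_sqrt`) that are not from the paper.

```python
import numpy as np

def sym_fun(S, f):
    """Apply f to a symmetric matrix S through its eigendecomposition."""
    w, Q = np.linalg.eigh(S)
    return (Q * f(w)) @ Q.T

def sinh_diamond(A, tol=1e-10):
    """Generalized hyperbolic sine from the compact SVD of A."""
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > tol * max(s.max(), 1.0)))
    return (U[:, :r] * np.sinh(s[:r])) @ Vt[:r, :]

def cosh_sqrt(w):
    # Eigenvalues of A A^T are >= 0 up to rounding; clip before the square root.
    return np.cosh(np.sqrt(np.clip(w, 0.0, None)))

# Assumed reading of the example: edges 1->3, 2->1, 2->4, 3->2, 4->2 (1-based labels).
A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
n = A.shape[0]
Z = np.zeros((n, n))
bip = np.block([[Z, A], [A.T, Z]])   # assumed form of the bipartite matrix
E = sym_fun(bip, np.exp)             # exp of a symmetric matrix

blocks_match = bool(
    np.allclose(E[:n, :n], sym_fun(A @ A.T, cosh_sqrt))
    and np.allclose(E[:n, n:], sinh_diamond(A))
    and np.allclose(E[n:, n:], sym_fun(A.T @ A, cosh_sqrt))
)
print(blocks_match)
```

Because the bipartite matrix is symmetric, its exponential can be formed from a symmetric eigendecomposition, which keeps the sketch free of any dependency beyond NumPy.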

Definition 4.3. Let $A = U_r \Sigma_r V_r^T$ be the adjacency matrix of a directed network.
We call total hub communicability of node $i$ the quantity

$$ C_h(i) = e_i^T \sinh^{\diamond}(A)\, \mathbf{1} = \sum_{k=1}^{r} \sinh(\sigma_k)\, (v_k^T \mathbf{1})\, u_k(i) $$

and total authority communicability of node $j$ the quantity

$$ C_a(j) = \mathbf{1}^T \sinh^{\diamond}(A)\, e_j = \sum_{k=1}^{r} \sinh(\sigma_k)\, (u_k^T \mathbf{1})\, v_k(j). $$


These quantities correspond to row or column sums of the off-diagonal block
of $e^{\mathcal{A}}$; therefore, $C_h(i)$ quantifies the ability of node $i$, playing the role of hub,
to communicate with all the nodes in the network, when they are all acting as
receivers of information. Similarly, $C_a(j)$ accounts for the ability of node $j$ as an
authority to receive information from all the nodes in the graph, when they are acting
as broadcasters of information.³ This feature highlights the fact that these definitions
are better suited than $e^A \mathbf{1}$ and $(\mathbf{1}^T e^A)^T$ when it comes to working on digraphs. This
result is summarized in the following proposition.

³ The reader is referred once again to [2] for a more detailed discussion of the interpretation of
the entries in the off-diagonal blocks of $e^{\mathcal{A}}$.

Table 1
Centrality measures for the nodes in the graph represented in Fig. 1 and described by the
adjacency matrix (4.4).

NODE   d_out(i)   d_in(i)   u_1(i)^2   v_1(i)^2   C_h(i)   C_a(i)
1      1          1         .0000      .3333      1.1752   1.3683
2      2          2         .5000      .3333      2.7366   2.7366
3      1          1         .2500      .0000      1.3683   1.1752
4      1          1         .2500      .3333      1.3683   1.3683

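Proposition 4.4. Let $A$ be the adjacency matrix of a graph $\mathcal{G} = (V, \mathcal{E})$. The
total hub communicability of node $i \in V$ can be written as

$$ C_h(i) = \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, e_i^T (u_k u_k^T)\, d_{\mathrm{out}} = \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, (v_k^T \mathbf{1}) \sum_{\substack{\ell \in V \\ i \to \ell}} v_k(\ell). \tag{4.7a} $$

Similarly, the total authority communicability of node $j \in V$ can be expressed as

$$ C_a(j) = \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, d_{\mathrm{in}}^T (v_k v_k^T)\, e_j = \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, (u_k^T \mathbf{1}) \sum_{\substack{\ell \in V \\ \ell \to j}} u_k(\ell). \tag{4.7b} $$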

Proof. Using the first equality in (4.3) one gets that

$$ C_h(i) = e_i^T \Bigl( \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, u_k u_k^T \Bigr) A \mathbf{1} = \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, e_i^T (u_k u_k^T)\, d_{\mathrm{out}}, $$

which proves the first equality of (4.7a). To prove the second, we apply the second
equality in (4.3):

$$ C_h(i) = (e_i^T A) \Bigl( \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, v_k v_k^T \Bigr) \mathbf{1} = \sum_{k=1}^{r} \frac{\sinh(\sigma_k)}{\sigma_k}\, (v_k^T \mathbf{1})\, A_{i,:} v_k, $$

where $A_{i,:}$ is the $i$th row of the adjacency matrix $A$. The conclusion then follows from
the fact that $A_{i,:} v_k = \sum_{\ell : i \to \ell} v_k(\ell)$. The proof of (4.7b) goes along the same lines and
is thus omitted. □
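
As a worked instance of (4.7a), consider the four-node example and assume its edges are 1→3, 2→1, 2→4, 3→2, 4→2, so that $d_{\mathrm{out}} = (1, 2, 1, 1)^T$. One valid choice of singular triplets (an assumption, since the repeated singular value $\sqrt{2}$ leaves the basis non-unique) is $\sigma_1 = \sigma_2 = \sqrt{2}$, $\sigma_3 = 1$, with $u_1 = e_2$, $v_1 = (e_1 + e_4)/\sqrt{2}$, $u_2 = (e_3 + e_4)/\sqrt{2}$, $v_2 = e_2$, $u_3 = e_1$, $v_3 = e_3$. For node $i = 2$ only the triplet $k = 1$ contributes in either form, and both expressions give the value reported for $C_h(2)$ in Table 1:

```latex
\begin{align*}
C_h(2) &= \frac{\sinh(\sqrt{2})}{\sqrt{2}}\,(e_2^T u_1)\,(u_1^T d_{\mathrm{out}})
        = \frac{\sinh(\sqrt{2})}{\sqrt{2}} \cdot 1 \cdot 2
        = \sqrt{2}\,\sinh(\sqrt{2}), \\
C_h(2) &= \frac{\sinh(\sqrt{2})}{\sqrt{2}}\,(v_1^T \mathbf{1})\,\bigl(v_1(1) + v_1(4)\bigr)
        = \frac{\sinh(\sqrt{2})}{\sqrt{2}} \cdot \sqrt{2} \cdot \sqrt{2}
        = \sqrt{2}\,\sinh(\sqrt{2}) \approx 2.7366 .
\end{align*}
```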

Before proceeding with the introduction of the associated edge centrality measure, 
we want to show with a small example that these measures of hub and authority 
centrality are indeed informative. Consider as an example the graph in Fig. 1. It is 
intuitive that node 2 should be given the highest score both as hub and as authority 
by any reasonable centrality measure. Consequently, the authority scores for nodes 
1 and 4 should be the same and higher than that of node 3 because these nodes are 
directly pointed to from node 2, which is the best hub in the graph. For a similar 
reason, nodes 3 and 4 should be ranked higher than node 1 when considering a hub 
score, since they directly point to node 2, which is the most important authority. 

Table 1 contains the centrality scores for the four nodes when the in/out-degree,
HITS centrality,⁴ and the total hub/authority communicability are considered. Clearly,
the in/out-degrees of the nodes do not capture the picture we just described since
they cannot discriminate between nodes 1, 3, and 4. This happens because the degree
centralities take into account only local information about how nodes propagate
information in the network.

⁴ To compute these scores, we initialize the HITS algorithm with the constant authority vector
with 2-norm equal to 1; see [22, 2].

Concerning HITS, the rankings given by the hub scores conform to our expectations,
but those given by the authority scores do not, since they are unable to identify
node 2 as the most authoritative one (it is tied with nodes 1 and 4). Another problem
with HITS is that the rankings will depend in general on the initial vector, since for
this example the matrices $AA^T$ and $A^TA$ are reducible (this also explains the occurrence
of zero entries in the hub and authority vectors). Note that this is a non-issue
for both $C_h(i)$ and $C_a(i)$; most importantly, however, these two measures succeed in
identifying the “correct” relative rankings for the hubs and authorities in this digraph.
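
The last two columns of Table 1 can be reproduced with a few lines of NumPy. This is a sketch under the assumption that the example digraph has edges 1→3, 2→1, 2→4, 3→2, 4→2 (a reading consistent with the degrees in Table 1); the function name `sinh_diamond` is illustrative.

```python
import numpy as np

def sinh_diamond(A, tol=1e-10):
    """Generalized hyperbolic sine U_r sinh(Sigma_r) V_r^T."""
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > tol * max(s.max(), 1.0)))
    return (U[:, :r] * np.sinh(s[:r])) @ Vt[:r, :]

# Assumed adjacency matrix of the four-node example (row i lists the targets of node i+1).
A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
ones = np.ones(A.shape[0])

S = sinh_diamond(A)
C_h = S @ ones       # total hub communicability: row sums
C_a = S.T @ ones     # total authority communicability: column sums
d_out = A @ ones
d_in = A.T @ ones

for i in range(A.shape[0]):
    print(i + 1, int(d_out[i]), int(d_in[i]), round(C_h[i], 4), round(C_a[i], 4))
```

Node 2 receives the largest score in both roles, while the degrees alone cannot separate nodes 1, 3, and 4.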

These observations motivate the introduction of a new edge centrality measure. 

Definition 4.5. Let $A$ be the adjacency matrix of a simple digraph. Then the
edge total communicability centrality of the existing/virtual edge $(i,j)$ is defined as

$$ {}^{e}gTC(i,j) = C_h(i)\, C_a(j), $$

where $C_h(i)$ and $C_a(j)$ are the total hub communicability of node $i$ and the total
authority communicability of node $j$, respectively.

Note that when the difference between the two largest singular values $\sigma_1 - \sigma_2$
is “large enough”, the quantities $C_h(i)$ and $C_a(j)$ are essentially determined by
$\sinh(\sigma_1)\|v_1\|_1 u_1(i)$ and $\sinh(\sigma_1)\|u_1\|_1 v_1(j)$, respectively. When this condition is
satisfied we expect agreement between the rankings for the edges provided by the edge
HITS and total communicability centrality measures, at least when the attention is
restricted to the top ranked edges.

It is natural to ask how the edge centrality measure just introduced is related to
the edge total communicability centrality applied to the undirected bipartite graph
associated with $\mathcal{A}$. For the centrality of the (virtual) edge $(i,j)$ we obtain

$$ {}^{e}TC(i,j') - {}^{e}gTC(i,j) = (e^{\mathcal{A}}\mathbf{1})_i \bigl(\cosh(\sqrt{A^TA})\mathbf{1}\bigr)_j + (e^{\mathcal{A}}\mathbf{1})_{j'} \bigl(\cosh(\sqrt{AA^T})\mathbf{1}\bigr)_i - \bigl(\cosh(\sqrt{AA^T})\mathbf{1}\bigr)_i \bigl(\cosh(\sqrt{A^TA})\mathbf{1}\bigr)_j, \tag{4.8} $$

where ${}^{e}TC(i,j')$ is the edge total communicability of $(i,j')$ in the bipartite graph
and $j' = j + n$.

The difference in (4.8) is positive and it may be so large that the edge selected when 
working on the digraph could well be different from that selected when working on the 
associated bipartite network, thus leading to different results for the two techniques. 
As we will see in the section on numerical experiments, the two criteria may indeed 
lead to different results. 
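
The relation (4.8) and the sign of the difference can be checked numerically. The sketch below is an illustration under stated assumptions: the bipartite matrix is taken to be $[0, A; A^T, 0]$, the four-node example is read as having edges 1→3, 2→1, 2→4, 3→2, 4→2, and all helper names are hypothetical.

```python
import numpy as np

def sym_fun(S, f):
    """Apply f to a symmetric matrix S through its eigendecomposition."""
    w, Q = np.linalg.eigh(S)
    return (Q * f(w)) @ Q.T

def cosh_sqrt(w):
    return np.cosh(np.sqrt(np.clip(w, 0.0, None)))

A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # assumed example matrix
n = A.shape[0]
ones = np.ones(n)
Z = np.zeros((n, n))
bip = np.block([[Z, A], [A.T, Z]])          # assumed bipartite matrix

x = sym_fun(bip, np.exp) @ np.ones(2 * n)   # e^{bip} 1: first n entries are nodes, last n their copies
ch = sym_fun(A @ A.T, cosh_sqrt) @ ones     # cosh(sqrt(A A^T)) 1
ca = sym_fun(A.T @ A, cosh_sqrt) @ ones     # cosh(sqrt(A^T A)) 1

U, s, Vt = np.linalg.svd(A)
S = (U * np.sinh(s)) @ Vt                   # sinh(0) = 0, so zero singular values drop out
C_h, C_a = S @ ones, S.T @ ones

lhs = np.outer(x[:n], x[n:]) - np.outer(C_h, C_a)                   # eTC(i, j') - egTC(i, j)
rhs = np.outer(x[:n], ca) + np.outer(ch, x[n:]) - np.outer(ch, ca)  # right-hand side of (4.8)
print(np.allclose(lhs, rhs), lhs.min() > 0)
```

Here `C_h` and `C_a` are computed independently from the SVD, so the comparison is not a tautology of the block structure.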

Remark 3. Concerning the actual computation of the quantities that occur in
Definition 4.3, one can either exploit the relationship (4.5) between $e^{\mathcal{A}}$ and $\sinh^{\diamond}(A)$
and use standard methods for computing the matrix exponential [19] or, if the matrix
$A$ is too large to build and work with $\mathcal{A}$ explicitly, one can obtain estimates of the
quantities of interest using the Golub-Kahan algorithm [15, 16]. Indeed, $\sinh^{\diamond}(A)$ can
be rewritten as

$$ \sinh^{\diamond}(A) = \sinh(\sqrt{AA^T})\, (\sqrt{AA^T})^{\dagger} A = A\, (\sqrt{A^TA})^{\dagger} \sinh(\sqrt{A^TA}), $$

where “†” denotes the Moore-Penrose pseudoinverse, and one can obtain estimates of
the desired row and column sums by applying Golub-Kahan bidiagonalization with an
appropriate starting vector ($A\mathbf{1}$ or $A^T\mathbf{1}$, respectively). We plan to investigate these
and other computational issues in future work. The test matrices used in this paper are
small enough that we could form and manipulate the matrix $\mathcal{A}$ explicitly. Therefore,
we expect the heuristics based on the two edge centrality measures ${}^{e}gTC(i,j)$ and
${}^{e}TC(i,j')$ to perform similarly in terms of timings.
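
The pseudoinverse form suggests computing the row and column sums as a function of $AA^T$ (or $A^TA$) applied to the vector $A\mathbf{1}$ (or $A^T\mathbf{1}$). The sketch below illustrates only that structure: it swaps in a dense symmetric eigendecomposition where Golub-Kahan bidiagonalization would be used for large problems, and the function name `phi_of_gram`, its tolerance, and the example matrix are assumptions.

```python
import numpy as np

def phi_of_gram(G, tol=1e-6):
    """sinh(sqrt(G)) times the pseudoinverse of sqrt(G), for symmetric PSD G."""
    w, Q = np.linalg.eigh(G)
    s = np.sqrt(np.clip(w, 0.0, None))
    g = np.zeros_like(s)
    keep = s > tol * max(s.max(), 1.0)        # the pseudoinverse ignores the null space
    g[keep] = np.sinh(s[keep]) / s[keep]
    return (Q * g) @ Q.T

A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)     # assumed four-node example
ones = np.ones(A.shape[0])

C_h = phi_of_gram(A @ A.T) @ (A @ ones)       # starting vector A 1
C_a = phi_of_gram(A.T @ A) @ (A.T @ ones)     # starting vector A^T 1
print(np.round(C_h, 4), np.round(C_a, 4))
```

Only matrix-vector products with the starting vectors are needed once the function of the Gram matrix can be applied, which is what makes a Krylov-type method attractive at scale.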

5. Heuristics. In this section we describe the methods we will use to perform
the numerical tests presented in section 6. For both the updating and downdating
problem, we will first rank the (virtual) edges using a variety of edge centrality
measures; for large graphs we may consider only a subset of all possible candidate edges,
as discussed below. For the updating problem, we will then select the top ranked
virtual edges, while for the downdating problem we will select the edges having
the lowest centrality rankings. Given a budget of K modifications to be performed,
we can proceed in one of two ways. We can either perform one edge modification at
a time and then recalculate all the necessary centrality scores right afterwards, or we
can perform all the modifications at once, without recalculation. This latter approach
will correspond to the .no variants of the algorithms. In the undirected case, the
latter approach was found to be essentially as effective as the former (even for relatively
large K) while being dramatically less expensive in terms of computational effort; see
[1].
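
The two ways of spending the budget of K modifications can be sketched as a single loop with a switch. This is a hypothetical illustration for the updating problem only (the function names, the `recompute` flag, and the example matrix are assumptions); downdating would instead pick the lowest-scoring existing edges.

```python
import numpy as np

def gtc_scores(A):
    """Edge scores C_h(i) * C_a(j) built from the generalized hyperbolic sine."""
    U, s, Vt = np.linalg.svd(A)
    S = (U * np.sinh(s)) @ Vt                 # sinh(0) = 0, so zero singular values drop out
    return np.outer(S.sum(axis=1), S.sum(axis=0))

def select_updates(A, K, score_fn, recompute=True):
    """Add K virtual edges; recompute=False mimics the '.no' variants."""
    A = A.astype(float).copy()
    n = A.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    scores = score_fn(A)
    chosen = []
    for _ in range(K):
        if recompute and chosen:
            scores = score_fn(A)              # re-rank after every modification
        candidates = np.where((A == 0) & off_diag, scores, -np.inf)
        i, j = np.unravel_index(np.argmax(candidates), candidates.shape)
        if candidates[i, j] == -np.inf:
            break                             # no virtual edges left
        A[i, j] = 1.0                         # rank-one update of the adjacency matrix
        chosen.append((int(i), int(j)))
    return A, chosen

A0 = np.array([[0, 0, 1, 0],
               [1, 0, 0, 1],
               [0, 1, 0, 0],
               [0, 1, 0, 0]], dtype=float)    # assumed four-node example
A_rec, edges_rec = select_updates(A0, 3, gtc_scores, recompute=True)
A_no, edges_no = select_updates(A0, 3, gtc_scores, recompute=False)
print(edges_rec, edges_no)
```

With `recompute=False` the ranking is computed once, so the cost of the whole selection is essentially that of a single scoring pass.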

As we already mentioned in section 4.1, we can either work on the bipartite network
associated with the digraph or directly on the original network. When working
on the original graph, addition/deletion of an edge corresponds to rank-one
updates/downdates to the corresponding adjacency matrix $A$.

The methods used are labeled as follows: 

• eig(.no). Let $x_1$ be the right eigenvector associated with the leading eigenvalue
of $A$ (assumed to be simple) and $y_1$ be the left eigenvector associated
with the same eigenvalue. Generalizing the definition for the edge eigenvector
centrality given in section 4.2, we can define in the case of digraphs:

$$ {}^{e}EC(i,j) := x_1(i)\, y_1(j). $$

This quantity has been recently used in [24] to devise algorithms aimed at
increasing as much as possible the leading eigenvalue of $A$ when edges are
added to the network.

• TC(.no). Here we use the total communicability $e^A \mathbf{1}$. The score assigned to
a (virtual) edge $(i,j)$ is:

$$ {}^{e}TC(i,j) := (e^A \mathbf{1})_i\, (\mathbf{1}^T e^A)_j. $$

This heuristic generalizes to digraphs the analogous one for undirected graphs
(cf. nodeTC(.no) in [1]).

• HITS(.no). Each (virtual) edge is given a score in terms of the quantities
introduced in Definition 4.2:

$$ {}^{e}HC(i,j) = u_1(i)\, v_1(j). $$

• gTC(.no). This heuristic is based on the edge total communicability defined
in terms of the generalized hyperbolic sine (see Definition 4.5). The (virtual)
edge $(i,j)$ is assigned the score:

$$ {}^{e}gTC(i,j) = C_h(i)\, C_a(j), $$

where $C_h(i) = (\sinh^{\diamond}(A)\mathbf{1})_i$ and $C_a(j) = (\sinh^{\diamond}(A^T)\mathbf{1})_j$.

The first two methods (with their variants) generalize to the case of digraphs the 
techniques which performed best in the undirected case. Notice that we have used 
the broadcaster score for the source node and the receiver score for the target node 
(see Remark 2). 
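
The four scores listed above can be put side by side in a short NumPy sketch. Everything here is an illustrative assumption rather than the implementation used for the experiments: the function names are hypothetical, $e^A\mathbf{1}$ is approximated by a truncated Taylor series that is adequate only for matrices of modest norm, and the example matrix is the assumed four-node digraph.

```python
import numpy as np

def expm_times_vector(A, v, terms=60):
    """Truncated Taylor series for e^A v (a simplification for small examples)."""
    result = v.astype(float).copy()
    term = v.astype(float).copy()
    for k in range(1, terms):
        term = A @ term / k
        result += term
    return result

def leading_eigvec(M):
    """Entrywise modulus of an eigenvector for the eigenvalue of largest real part."""
    w, X = np.linalg.eig(M)
    return np.abs(X[:, np.argmax(w.real)])

def edge_scores(A):
    ones = np.ones(A.shape[0])
    U, s, Vt = np.linalg.svd(A)
    S = (U * np.sinh(s)) @ Vt                  # generalized hyperbolic sine
    return {
        # right eigenvector of A times left eigenvector (right eigenvector of A^T)
        "eig": np.outer(leading_eigvec(A), leading_eigvec(A.T)),
        # (e^A 1)_i (1^T e^A)_j, using 1^T e^A = (e^{A^T} 1)^T
        "TC": np.outer(expm_times_vector(A, ones), expm_times_vector(A.T, ones)),
        # u_1(i) v_1(j); only meaningful when sigma_1 is simple
        "HITS": np.outer(np.abs(U[:, 0]), np.abs(Vt[0, :])),
        # C_h(i) C_a(j)
        "gTC": np.outer(S.sum(axis=1), S.sum(axis=0)),
    }

A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)      # assumed four-node example
scores = edge_scores(A)
virtual = (A == 0) & ~np.eye(A.shape[0], dtype=bool)   # candidate edges for updating
print({name: np.round(M[virtual], 3) for name, M in scores.items()})
```

For this particular example the two largest singular values coincide, so the `"HITS"` entry depends on the basis returned by the SVD routine; this mirrors the dependence on the initial vector noted for HITS in the text.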

Next, we consider the bipartite network associated with the matrix $\mathcal{A}$ defined in
(3.3). The criteria we use to select the modifications are based on the edge centrality
measures described in section 4.2. We will label the methods as follows:

• b:eig(.no). We use the eigenvector centrality of edges; the edge eigenvector
centrality of the (virtual) edge $(i,j')$ is defined as

$$ {}^{e}EC(i,j') = q_1(i)\, q_1(j'), $$

where $q_1$ is the Perron vector of $\mathcal{A}$.

• b:TC(.no). This is based on the total communicability centrality of edges:
each (virtual) edge $(i,j')$ is assigned the score:

$$ {}^{e}TC(i,j') = (e^{\mathcal{A}}\mathbf{1})_i\, (e^{\mathcal{A}}\mathbf{1})_{j'}. $$

• b:deg. This simple heuristic is equivalent to the degree method in [1]. Each
(virtual) edge is assigned a score of the form:

$$ d(i) + d(j'), \qquad i \in V \text{ and } j' \in V', $$

where $d(i) = (\mathcal{A}\mathbf{1})_i$ is the degree of node $i$ in the network represented by $\mathcal{A}$.

Remark 4. We do not provide a method that generalizes degree in [1] to the case
of digraphs since it would coincide with the heuristic b:deg just introduced. Indeed,
the straightforward generalization would assign to the (virtual) edge $i \to j$
the score $d_{\mathrm{out}}(i) + d_{\mathrm{in}}(j)$. However, it is easy to see that $d_{\mathrm{out}}(i) = d(i)$ where $i \in V$
and $d_{\mathrm{in}}(j) = d(j')$ where $j' \in V'$, and thus this technique would be indistinguishable
from b:deg. Note that this technique is the optimal one if we want to optimize
the sum $T_hC(A) + T_aC(A)$ and we use the second order Maclaurin approximations
$\cosh(\sqrt{X}) \approx I + \frac{X}{2}$, with $X = AA^T, A^TA$, to compute the total hub and authority
communicabilities.
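
The optimality claim can be made explicit with a short computation. The derivation below is a sketch that assumes the total hub and authority communicabilities are $T_hC(A) = \mathbf{1}^T \cosh(\sqrt{AA^T})\mathbf{1}$ and $T_aC(A) = \mathbf{1}^T \cosh(\sqrt{A^TA})\mathbf{1}$, as the reference to $\cosh$ in the remark suggests.

```latex
\begin{align*}
T_hC(A) + T_aC(A)
  &\approx \mathbf{1}^T\Bigl(I + \tfrac12 AA^T\Bigr)\mathbf{1}
         + \mathbf{1}^T\Bigl(I + \tfrac12 A^TA\Bigr)\mathbf{1} \\
  &= 2n + \tfrac12 \|A^T\mathbf{1}\|_2^2 + \tfrac12 \|A\mathbf{1}\|_2^2
   = 2n + \tfrac12 \sum_{\ell \in V} \bigl(d_{\mathrm{in}}(\ell)^2 + d_{\mathrm{out}}(\ell)^2\bigr).
\end{align*}
% Adding the virtual edge i -> j raises d_out(i) and d_in(j) by one each,
% so the approximate sum changes by
\[
  \tfrac12\bigl[(d_{\mathrm{out}}(i)+1)^2 - d_{\mathrm{out}}(i)^2\bigr]
  + \tfrac12\bigl[(d_{\mathrm{in}}(j)+1)^2 - d_{\mathrm{in}}(j)^2\bigr]
  = d_{\mathrm{out}}(i) + d_{\mathrm{in}}(j) + 1 .
\]
```

Under this approximation the gain is largest for the virtual edge maximizing $d_{\mathrm{out}}(i) + d_{\mathrm{in}}(j)$, which is the degree score.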

When working on the matrix associated with the bipartite graph, each edge modification
of the corresponding network will cause a rank-two change in $\mathcal{A}$. We want
to stress once again that the set of (virtual) edges among which to select the
modifications is the same whether we work on $A$ or on $\mathcal{A}$ and corresponds to the set
of (virtual) edges of the graph $\mathcal{G}$, or a subset of it. For large networks, the set of
virtual edges among which to select the updates may be too large to be exhaustively
searched. In this work we used the whole set for all the networks used in the experiments
except the largest one, namely cit-HepTh (see Table 2). For this problem, we
restrict the search to a subset of the set of all virtual edges constructed as follows. We
first rank in descending order the nodes of the bipartite graph using the eigenvector
centrality. This results in a ranking of $2n$ elements: the nodes in $V$ and their copies.
Next, for each $i = 1, \ldots, n$ we remove from the list the one element between $i$ and its
copy $i'$ which has the lowest rank. We now have a list of length $n$ which includes, for
each node, either the node itself (element of $V$) or its copy (element of $V'$). We thus
relabel all the copies, if present, with the label of the corresponding node in $V$. The
resulting list contains all the $n$ nodes in the original graph. It has been obtained
considering, for each node, its best performance between its role as hub and its role as
authority in the network. Finally, we take the induced subgraph corresponding to the
top 10% of the nodes in this list. The set of virtual edges in this subgraph is the set we
exhaustively search.
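
The construction of the reduced candidate set can be sketched as follows. This is a hypothetical illustration: the function name, the dense eigensolver (a sparse solver would be needed at the scale of cit-HepTh), the rounding rule for the 10% cut-off, and the five-node test digraph are all assumptions.

```python
import numpy as np

def candidate_virtual_edges(A, fraction=0.10):
    """Virtual edges of the subgraph induced by the top-ranked nodes."""
    n = A.shape[0]
    Z = np.zeros((n, n))
    bip = np.block([[Z, A], [A.T, Z]])        # assumed bipartite matrix
    w, Q = np.linalg.eigh(bip)
    q = np.abs(Q[:, -1])                      # eigenvector of the largest eigenvalue
    # Keeping the better-ranked element between node i and its copy i'
    # amounts to scoring node i by the larger of its two centralities.
    best = np.maximum(q[:n], q[n:])
    order = np.argsort(-best, kind="stable")
    k = max(1, int(round(fraction * n)))
    top = np.sort(order[:k])
    edges = [(int(i), int(j)) for i in top for j in top
             if i != j and A[i, j] == 0]
    return top, edges

# Hypothetical digraph: node 0 points to 1..4, node 1 points to 2.
A = np.zeros((5, 5))
A[0, 1:] = 1.0
A[1, 2] = 1.0
top_nodes, cand = candidate_virtual_edges(A, fraction=0.4)
print(top_nodes, cand)
```

In the toy example node 0 is retained for its hub role and node 2 for its authority role, and the only virtual edge inside the induced subgraph is the one from node 2 to node 0.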

5.1. Rank-two modifications. Before discussing the results obtained by applying
our techniques to select rank-one updates of the matrix $A$, we want to briefly
discuss how these techniques may be modified in order to make them suitable to select
symmetric rank-two modifications of the unsymmetric adjacency matrix. This
approach goes beyond the scope of this paper, but it is worth some discussion. Indeed,
in real world applications one may conceivably want to add (or delete) two-directional
edges between nodes in a digraph in order to tune its total communicabilities.
In this setting, the downdating and updating problems aim at the same
goals as before, but the sets in which one searches for modifications are different from
those used in our original problems. Indeed, the updates will be selected in the set
$\{(i,j) \in V \times V \mid (i,j), (j,i) \notin \mathcal{E}\}$, while the downdates will be selected among the edges
in $\{(i,j) \in V \times V \mid (i,j), (j,i) \in \mathcal{E}\}$.

We start by discussing the case of the degree and of the edge HITS centrality.
The results obtained for these two approaches will motivate the generalization of the
other techniques. As we have observed in Remark 4, the degree strategy works as
the optimal strategy when we consider a second order approximation of the terms
in the sum $T_hC(A) + T_aC(A)$. By carrying out the same computation, replacing a
rank-one update of the adjacency matrix with a rank-two update, one finds that the
most natural generalization requires that the quantities used to rank the (virtual)
edges by the method based on the degree of nodes are

$$ [d_{\mathrm{in}}(i) + d_{\mathrm{out}}(j)] + [d_{\mathrm{out}}(i) + d_{\mathrm{in}}(j)]. $$

A similar result can be obtained if we want to adapt HITS to handle rank-two updates.
Indeed, to rank the undirected (virtual) edges one may use the quantities

$$ {}^{e}HITS(i,j) + {}^{e}HITS(j,i). $$

This follows from the application to the matrices
$(A + e_i e_j^T + e_j e_i^T)(A + e_i e_j^T + e_j e_i^T)^T$ and $(A + e_i e_j^T + e_j e_i^T)^T (A + e_i e_j^T + e_j e_i^T)$
of the same techniques used in the proof of Proposition 4.1.

From these simple results, it follows that the quantities used by the other heuristics
to handle rank-two modifications of the adjacency matrix of a digraph have the
form

$$ {}^{e}C(i,j) + {}^{e}C(j,i), $$

where ${}^{e}C$ is one among the edge centralities used in the previous section to work in
the directed case.
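
In code, the symmetric variant amounts to adding a score matrix to its transpose and restricting attention to pairs with no edge in either direction. The sketch below uses the degree score as the example; the function name and the four-node test digraph are illustrative assumptions.

```python
import numpy as np

def symmetric_candidates(A, scores):
    """Rank unordered pairs {i, j} for which neither i->j nor j->i is present."""
    n = A.shape[0]
    sym = scores + scores.T                   # eC(i,j) + eC(j,i)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)
             if A[i, j] == 0 and A[j, i] == 0]
    pairs.sort(key=lambda p: -sym[p])         # highest combined score first
    return [(p, float(sym[p])) for p in pairs]

A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)     # assumed four-node example
d_out = A.sum(axis=1)
d_in = A.sum(axis=0)
degree_scores = d_out[:, None] + d_in[None, :]   # d_out(i) + d_in(j)

ranked = symmetric_candidates(A, degree_scores)
print(ranked)
```

Any of the directed edge scores can be passed in place of `degree_scores`; for downdating the candidate set would instead consist of pairs connected in both directions.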

6. Numerical tests. The numerical tests have been performed on five networks,
which come from three sources. The small network GD95b comes from the University
of Florida sparse matrix collection [7] and represents entries in a graph drawing
context. The citation network cit-HepTh, the largest one in our data set, also comes
from the University of Florida sparse matrix collection [7]. The networks Abortion
and Computational Complexity are small web graphs consisting of web sites on the
topic of abortion and computational complexity. They are available online at [25].
Finally, the network Twitter can be found at [14]; it contains mentions and retweets
of some part of the social network Twitter. Table 2 summarizes some properties of
the networks in our dataset; namely, it contains the number of nodes $n$ and edges $m$,
the two largest singular values of the adjacency matrix $\sigma_1$, $\sigma_2$, their difference $\sigma_1 - \sigma_2$,
and the number of virtual edges $\tau$. An exception is the network cit-HepTh, for which
$\tau$ is the number of virtual edges contained in the subgraph of the network constructed
as described at the end of section 5.

Table 2
Description of the dataset.

NETWORK            n       m        τ          σ_1      σ_2      σ_1 − σ_2
GD95b              73      96       5160       4.79     4.37     0.428
Comp. Complexity   857     1596     731996     10.93    9.87     1.05
Abortion           2262    9624     5104728    31.91    20.04    5.87
Twitter            3656    188712   13176871   189.15   120.54   68.71
cit-HepTh          27400   352547   3730367    85.16    69.31    15.85

Fig. 2. Evolution of $T_hC$ and $T_aC$ for the network GD95b when 25 edge modifications are
performed working on the matrix $A$ associated with the digraph: updates (top) and downdates (bottom).

The small network is used to compare the effectiveness of the proposed heuristics
with a “brute force” approach where each virtual edge is added in turn and the change
in total communicability is monitored in order to find the “optimal” choice. Since
we are tracking not one but two quantities, $T_hC(A)$ and $T_aC(A)$, we monitor both
$T_hC(A) + T_aC(A)$ and $T_hC(A) \cdot T_aC(A)$ and choose the optimal edge for either one
of them. These methods are labeled as opt sum and opt prod, respectively. We
perform a similar set of experiments for the downdating. As a baseline method, we
also report results for a random selection of the edges in all our tests. The random
methods are labeled as random or b:random, depending on whether we work on the
matrix $A$ or on $\mathcal{A}$.

Fig. 3. Evolution of $T_hC$ and $T_aC$ for the network GD95b when 25 symmetric edge
modifications are performed working on the matrix $A$ associated with the digraph. The optimal
methods refer to the rank-one selection of the modifications.
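
One step of the brute-force search can be sketched as follows. This is an illustration under assumptions, not the code used for the experiments: it takes $T_hC(A) = \mathbf{1}^T \cosh(\sqrt{AA^T})\mathbf{1}$ and $T_aC(A) = \mathbf{1}^T \cosh(\sqrt{A^TA})\mathbf{1}$, evaluates them from a full SVD, and uses hypothetical function names and the assumed four-node example.

```python
import numpy as np

def total_comms(A):
    """(T_hC, T_aC) under the assumed definitions, via the full SVD of A."""
    U, s, Vt = np.linalg.svd(A)
    c = np.cosh(s)                                   # cosh(0) = 1 keeps null-space terms
    t_h = float(c @ (U.sum(axis=0) ** 2))            # sum_k cosh(s_k) (1^T u_k)^2
    t_a = float(c @ (Vt.sum(axis=1) ** 2))           # sum_k cosh(s_k) (1^T v_k)^2
    return t_h, t_a

def brute_force_update(A):
    """Try every virtual edge once; return the best edge for the sum and for the product."""
    n = A.shape[0]
    best_sum, best_prod = None, None
    for i in range(n):
        for j in range(n):
            if i == j or A[i, j] != 0:
                continue
            B = A.copy()
            B[i, j] = 1.0
            t_h, t_a = total_comms(B)
            if best_sum is None or t_h + t_a > best_sum[0]:
                best_sum = (t_h + t_a, (i, j))
            if best_prod is None or t_h * t_a > best_prod[0]:
                best_prod = (t_h * t_a, (i, j))
    return best_sum, best_prod

A = np.array([[0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)            # assumed four-node example
t_h0, t_a0 = total_comms(A)
opt_sum, opt_prod = brute_force_update(A)
print(round(t_h0, 4), round(t_a0, 4), opt_sum, opt_prod)
```

Each candidate requires a fresh decomposition, so one step costs one SVD per virtual edge, which is why this approach is practical only for the smallest network.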

In Fig. 2 we show plots of the total communicabilities $T_hC$ and $T_aC$ when up to
K = 25 edge modifications are performed. We limit ourselves to the results for the
heuristics based on the original digraph (matrix $A$). The results show that the heuristic
gTC performs as well as the “optimal” choice based on brute force, while of course
being much less expensive, in tackling both the updating and downdating problem.
Note moreover that the performance of the methods HITS and gTC is different for
this network. This result agrees with what one would expect, in view of the small
gap $\sigma_1 - \sigma_2$ of the adjacency matrix under study. When considering the problem
of downdating, on the other hand, all the methods perform well. In particular we
want to stress again the excellent performance of the method gTC. The only exception
is perhaps the heuristic eig, whose performance for the first 5 steps is comparable
with the random choice. This result confirms our claim that this heuristic, which was
shown in [1] to work very well for undirected networks, is not a good approach in the
directed case.

In Fig. 3 we display the evolution of the total communicability indices under
rank-two updates. In this plot we retain the same names for the techniques as used in
the case of the rank-one modifications; however, the quantities used to derive the rankings
are defined as in subsection 5.1. In this figure, each step corresponds to a rank-two
symmetric modification, for the heuristics based on the edge centrality measures, and
to two rank-one modifications, for the optimal methods. Thus, the plots for the
optimal methods coincide with those in Fig. 2. The results displayed in Fig. 3 tell us
that the symmetric rank-two modifications of the matrix may not lead to results as
good as those obtained with the rank-one updates. Indeed, for both the total hub and
authority communicabilities we have at least three methods in Fig. 2 that outperform
all the methods used in Fig. 3. For this reason, we have not further investigated this
approach.

The results on the small network give us confidence that at least some of our
proposed heuristics do a very good job at enhancing the communicability properties
of digraphs. In the remaining tests we concentrate on the larger four networks, for
which the “optimal,” brute force approaches are not practical. All experiments were
performed using MATLAB Version 8.0.0.783 (R2012b) on an IBM ThinkPad running
Ubuntu 14.04 LTS, a 2.5 GHz Intel Core i5 processor, and 3.7 GiB of RAM.

Figs. 4-7 display the evolution of the total hub and authority communicabilities
(rescaled by the number of edges in the network) when K = 200 updates are selected
using the criteria previously introduced. The plots at the top of each figure display the
evolution of the total hub communicability (left) and total authority communicability
(right) when the digraph is modified using the techniques developed for the directed
case. The bottom plots show the evolution of the two indices obtained when the
modifications are selected by working on $\mathcal{A}$. As expected, the proposed heuristics are
dramatically better than the random choice.

Fig. 4. Evolution of $T_hC$ and $T_aC$ for the network Computational Complexity when 200 updates
are selected working on the matrix $A$ associated with the digraph (top) and on its bipartite version
$\mathcal{A}$ (bottom).

Fig. 5. Evolution of $T_hC$ and $T_aC$ for the network Abortion when 200 updates are selected
working on the matrix $A$ associated with the digraph (top) or on its bipartite version $\mathcal{A}$ (bottom).

Fig. 6. Evolution of $T_hC$ and $T_aC$ for the network Twitter when 200 updates are selected
working on the matrix $A$ associated with the digraph (top) and on its bipartite version $\mathcal{A}$ (bottom).

Fig. 7. Evolution of $T_hC$ and $T_aC$ for the network cit-HepTh when 200 updates are selected
working on the matrix $A$ associated with the digraph (top) or on its bipartite version $\mathcal{A}$ (bottom).

Table 3
Timings in seconds when K = 200 updates are selected for the networks in our dataset using
the methods described.

METHOD     Computational Complexity   Abortion   Twitter   cit-HepTh
eig        12.51                      53.27      139.82    217.12
eig.no     0.13                       0.73       1.75      1.33
TC         114.67                     62.22      187.22    163.55
TC.no      0.61                       0.76       2.19      1.02
HITS       8.35                       50.69      133.50    88.82
HITS.no    0.09                       0.63       1.69      0.67
gTC        10.63                      59.31      183.48    205.17
gTC.no     0.12                       0.68       1.77      1.28
b:eig      9.35                       52.43      134.03    99.70
b:eig.no   0.21                       0.69       1.66      0.88
b:deg      11.00                      85.06      256.66    84.95
b:TC       11.39                      59.31      154.97    139.50
b:TC.no    0.11                       0.72       1.67      0.82

The results show that the heuristics b:eig(.no) and b:TC(.no) perform similarly
to HITS(.no) and gTC(.no). The methods eig(.no) and TC(.no) display
erratic behavior and often perform very poorly, as shown in Figs. 4, 5, and 7. The
method eig(.no) also suffers from the restriction that the dominant eigenvalue must
be simple, which is not always true in practice. Likewise, the performance of b:deg is
generally unsatisfactory, with the exception of $T_aC(A)$ for the network Computational
Complexity where it outperforms the other techniques (see Fig. 4). Overall, considering
also the timings (see Table 3), the best performance is displayed by the heuristics
gTC(.no) and HITS(.no) and by their undirected counterparts b:TC(.no) and
b:eig(.no). The only possible exception is the Computational Complexity network,
for which the heuristics for the directed case outperform those for the undirected,
bipartite counterpart.

The disagreement between the results for the heuristics labeled HITS and b:eig
for the network Computational Complexity is at first sight puzzling. The two criteria
should lead to the same edge selection and therefore to the same results, since the
principal eigenvector of $\mathcal{A}$ is $q_1 = (u_1^T, v_1^T)^T$ and thus $q_1(i) = u_1(i)$ and $q_1(j') = v_1(j)$
in the definition of the heuristic b:eig. However, if at least two edges have the same
centrality score when working with b:eig and HITS, then the two methods may select
different edges. In this case, after the edge modification has been performed, the
adjacency matrices manipulated by the two methods are different, thus causing the
difference we observe in Fig. 4. The difference will be more pronounced if the tie
between edges occurs at the beginning of the modification process.
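
The relation between the two criteria is easy to confirm on a small digraph whose largest singular value is simple. The sketch below is illustrative: the five-node test digraph and variable names are assumptions, and with unit-norm vectors the Perron vector of the bipartite matrix equals $(u_1^T, v_1^T)^T/\sqrt{2}$, a scaling that does not affect rankings.

```python
import numpy as np

# Hypothetical digraph with a simple largest singular value:
# node 0 points to 1..4, node 1 points to 2.
A = np.zeros((5, 5))
A[0, 1:] = 1.0
A[1, 2] = 1.0
n = A.shape[0]

U, s, Vt = np.linalg.svd(A)
u1, v1 = np.abs(U[:, 0]), np.abs(Vt[0, :])    # dominant hub and authority vectors

Z = np.zeros((n, n))
bip = np.block([[Z, A], [A.T, Z]])            # assumed bipartite matrix
w, Q = np.linalg.eigh(bip)
q1 = np.abs(Q[:, -1])                         # Perron vector (unit 2-norm)

hits_scores = np.outer(u1, v1)                # u_1(i) v_1(j)
b_eig_scores = np.outer(q1[:n], q1[n:])       # q_1(i) q_1(j')
print(np.allclose(b_eig_scores, hits_scores / 2))
```

Since the two score matrices are proportional, any difference in the selected edges can only come from how ties (or near-ties in floating point) are broken.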

Table 3 contains the timings (in seconds) employed for the selection of the K =
200 virtual edges to be updated. The heuristics used were implemented using mostly
built-in MATLAB functions, such as the function eigs used for computing the largest
eigenvalue. For the heuristics requiring the computation of a matrix function times
a vector we used the code funm_kryl by S. Güttel [17]. To implement the degree-based
heuristic we wrote our own code, which is far from optimal when compared to
the other ones. The relatively high timings reported for this heuristic can likely be
reduced with a more careful implementation. When interpreting the results, it has
to be kept in mind that the size $\tau$ of the set of virtual edges can be pretty large (cf.
Table 2). We have observed that, for all the methods, roughly half of the reported
computing time is spent in the computation of the products used in the definitions
of the edge centrality measures. Nevertheless, the timings range from very small to
moderate in all cases, showing the feasibility of the proposed heuristics.

Fig. 8. Evolution of $T_hC$ and $T_aC$ for the network Computational Complexity when 200
downdates are selected working on the matrix $A$ associated with the digraph (top) and on its bipartite
version $\mathcal{A}$ (bottom).

Among all the methods we tested on directed networks for the updating problem,
the best performance is displayed by HITS(.no), gTC(.no), b:eig(.no) and
b:TC(.no), with the methods that manipulate $A$ having the edge when $\sigma_1 - \sigma_2$ is
small. Due to its erratic behavior, we cannot recommend the use of b:deg in general.

Similar conclusions can be drawn when considering the results for the downdating
problem, although the differences among the techniques are less pronounced (Figs. 8-11
and Table 4). Indeed, the results shown confirm the effectiveness of the techniques
based on the edge HITS and total communicability centralities and of their variants
which do not require the recomputation of the rankings. As in the case of the updating
problem, the results returned by these two methods essentially reproduce those
obtained when working on $\mathcal{A}$ using the heuristics b:eig(.no) and b:TC(.no).

The methods eig(.no) and TC(.no) perform no better (and in some cases worse)
than gTC(.no) and HITS(.no), while b:deg is usually outperformed by b:eig(.no)
or b:TC(.no).

Concerning the timings, if we compare the results in Tables 3 and 4 we can see
that the values in Table 3 are in general higher than those in Table 4. This is easily
understood in view of what we observed before, if one compares the number of virtual
edges $\tau$ with the number of edges $m$ in each network in the dataset (see Table 2).

Fig. 9. Evolution of $T_hC$ and $T_aC$ for the network Abortion when 200 downdates are selected
working on the matrix $A$ associated with the digraph (top) and on its bipartite version $\mathcal{A}$ (bottom).

Fig. 10. Evolution of $T_hC$ and $T_aC$ for the network Twitter when 200 downdates are selected
working on the matrix $A$ associated with the digraph (top) and on its bipartite version $\mathcal{A}$ (bottom).

Fig. 11. Evolution of $T_hC$ and $T_aC$ for the network cit-HepTh when 200 downdates are selected
working on the matrix $A$ associated with the digraph.

Table 4
Timings in seconds when K = 200 downdates are selected for the networks in our dataset using
the methods described.

METHOD     Computational Complexity   Abortion   Twitter   cit-HepTh
eig        5.83                       7.77       16.65     201.29
eig.no     0.04                       0.04       0.05      1.03
TC         104.12                     15.05      75.35     126.18
TC.no      0.92                       0.07       0.39      0.73
HITS       2.72                       4.71       13.32     63.40
HITS.no    0.02                       0.02       0.08      0.34
gTC        5.49                       12.06      60.77     175.02
gTC.no     0.04                       0.08       0.29      0.80
b:eig      4.31                       6.63       15.10     85.87
b:eig.no   0.03                       0.04       0.05      1.89
b:deg      0.06                       0.15       3.94      8.37
b:TC       5.51                       11.66      39.02     126.44
b:TC.no    0.02                       0.05       0.16      0.42

While we do not provide a formal assessment of the computational cost of the
various heuristics, arguments similar to those found in [1] indicate that the cost of
the more efficient heuristics can be expected to scale approximately like $O(n)$ or
$O(n \log n)$ with the number of nodes $n$.

In conclusion, by considering the overall performance of the methods and their
cost (in terms of timings), we find that the best criteria for our updating/downdating
goals are the methods HITS(.no) and gTC(.no). Besides these, satisfactory results
may also be obtained using b:eig(.no) or b:TC(.no). From the timings in Tables 3
and 4 we can deduce that the heuristics HITS(.no) are in general slightly faster
than b:eig(.no) and may thus be preferred. Concerning whether it is better to use
gTC(.no) or b:TC(.no), we anticipate that the first will be preferable when used
in conjunction with fast algorithms for the approximation of bilinear forms involving
generalized matrix functions.

7. Conclusions and future work. In this work we have extended the notion
of total network communicability to directed graphs, and developed heuristics for
manipulating an existing directed network so as to enhance its communicability
properties. In doing so we made use of the concept of alternating walks, which allows us
to take into account the dual role played by each node in a digraph, namely, receiver
and broadcaster of information. This in turn led us in a natural way to the (rather
overlooked) concept of generalized matrix function, first introduced in [18]. As shown
in the paper, this concept allows one to express various communicability measures for
digraphs in a compact form.

Our computational results indicate that the heuristics which take into account 
the dual role of nodes in directed networks tend to be preferable to those that do not. 
We also showed that these heuristics are very fast in practice. 

Future work will address computational issues for large-scale networks (in particular,
fast algorithms for estimating the row and column sums of generalized matrix
functions). Another avenue for future work is the extension of the techniques in
this paper and in [1] to weighted graphs. We also plan to investigate the use of our
heuristics to tune other network properties. Preliminary tests suggest that our edge
modification techniques are effective at increasing the synchronizability in directed
graphs (see, e.g., [28]). A more systematic exploration of this application is left for
future work.

offered in 2015, when part of this work was completed.